AITopics | target encoder

0918183ced31affb7ce0345e45ac1943-Supplemental-Conference.pdf

Neural Information Processing Systems | Apr-24-2026, 11:08:30 GMT

We evaluate Okapi on three datasets (iWildCam, PovertyMap, and CivilComments) taken from the WILDS 2.0 benchmark [63]. These datasets were chosen because [63] reports that semi-supervised and domain adaptation methods perform poorly on them across the board relative to the ERM baselines. For PovertyMap in particular, ERM was found to vastly outperform every competing method that uses the unlabelled data and/or domain labels. For iWildCam, the task is multiclass species classification of animals in camera trap images. The dataset contains 1022K images of animals, each annotated with a domain label, s, that identifies the camera trap that captured it.

artificial intelligence, encoder, machine learning, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


6274172f7d981a8d58bbfd52342a9d1f-Paper-Conference.pdf

Neural Information Processing Systems | Feb-12-2026, 17:26:06 GMT

artificial intelligence, machine learning, representation, (13 more...)

Neural Information Processing Systems

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Asia > South Korea > Seoul > Seoul (0.04)
Asia > South Korea > Daejeon > Daejeon (0.04)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)


0918183ced31affb7ce0345e45ac1943-Supplemental-Conference.pdf

Neural Information Processing Systems | Feb-7-2026, 08:56:28 GMT

dataset, encoder, okapi, (14 more...)

Neural Information Processing Systems

Country: Africa (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


Connecting Joint-Embedding Predictive Architecture with Contrastive Self-supervised Learning

Neural Information Processing Systems | Feb-7-2026, 07:09:13 GMT

Figure 1: Our C-JEPA achieves faster and better convergence than I-JEPA. Unsupervised learning of visual representations has recently seen remarkable progress, primarily due to the development of innovative architectures and strategies that exploit unlabeled imagery.

artificial intelligence, machine learning, zyj, (17 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.34)


6274172f7d981a8d58bbfd52342a9d1f-Paper-Conference.pdf

Neural Information Processing Systems | Oct-10-2025, 23:19:38 GMT

artificial intelligence, machine learning, representation, (13 more...)

Neural Information Processing Systems

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Asia > South Korea > Seoul > Seoul (0.04)
Asia > South Korea > Daejeon > Daejeon (0.04)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)


Connecting Joint-Embedding Predictive Architecture with Contrastive Self-supervised Learning

Neural Information Processing Systems | Oct-9-2025, 17:26:28 GMT

Our contributions are manifold and significant. Firstly, we identify and articulate the limitations inherent in the I-JEPA framework, specifically its EMA and prediction mechanisms.

attention map, c-jep, representation, (15 more...)

Neural Information Processing Systems

Country: Asia > China > Heilongjiang Province > Daqing (0.04)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)


Unsupervised Training of Vision Transformers with Synthetic Negatives

Giakoumoglou, Nikolaos, Floros, Andreas, Papadopoulos, Kleanthis Marios, Stathaki, Tania

arXiv.org Artificial Intelligence | Sep-3-2025

This paper does not introduce a novel method per se. Instead, we address the neglected potential of hard negative samples in self-supervised learning. Previous works explored synthetic hard negatives but rarely in the context of vision transformers. We build on this observation and integrate synthetic hard negatives to improve vision transformer representation learning. This simple yet effective technique notably improves the discriminative power of learned representations. Our experiments show performance improvements for both DeiT-S and Swin-T architectures.

artificial intelligence, inductive learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2509.02024

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.37)


Audio-JEPA: Joint-Embedding Predictive Architecture for Audio Representation Learning

Tuncay, Ludovic, Labbé, Etienne, Benetos, Emmanouil, Pellegrini, Thomas

arXiv.org Artificial Intelligence | Jul-8-2025

Self-Supervised Learning (SSL) has revolutionized representation learning for speech and audio, enabling models to learn from unlabeled data and excel in diverse downstream tasks [1, 2, 3, 4]. Early SSL approaches for audio, such as contrastive predictive coding and wav2vec 2.0, learned latent speech representations by masking the input and solving a contrastive task over latent codes [5]. Follow-up methods like HuBERT [1] introduced offline clustering to generate pseudo-labels for masked audio segments, and WavLM [6] applied data augmentation and denoising to improve robustness in speech representation learning. More recently, latent prediction approaches have gained traction: data2vec [7] and its efficient successor data2vec 2.0 [8] employ a teacher-student framework to predict contextualized latent representations of the input, achieving strong results across vision, speech, and language tasks. In the audio domain, Niizumi et al. introduced Masked Modeling Duo (M2D) [4], which uses two networks (online and momentum encoder) to predict masked patch embeddings and attained state-of-the-art results on numerous audio benchmarks. In computer vision, a new paradigm called Joint-Embedding Predictive Architecture (JEPA) [9, 10, 11] has been proposed to predict hidden content in a high-level latent space instead of pixel space.
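
The online/momentum (teacher-student) pairing mentioned above is commonly realised by keeping the target encoder as an exponential moving average (EMA) of the online encoder. The following is a minimal sketch of that update only, not the implementation of any method cited here; the function name `ema_update`, the dictionary layout of the parameters, and the momentum value are illustrative assumptions.

```python
import numpy as np

def ema_update(target_params, online_params, momentum=0.99):
    """Move each target parameter toward the online one by an EMA step.

    Hypothetical helper: parameters are assumed to be dicts of NumPy arrays
    that share the same keys and shapes.
    """
    return {
        name: momentum * target_params[name] + (1.0 - momentum) * online_params[name]
        for name in target_params
    }

# Toy parameters: the online encoder has been trained, the target starts at zero.
online = {"w": np.ones((2, 2)), "b": np.zeros(2)}
target = {"w": np.zeros((2, 2)), "b": np.zeros(2)}

# One EMA step; no gradients flow into the target encoder.
target = ema_update(target, online, momentum=0.9)
```

A momentum close to 1 makes the target encoder change slowly, which is the usual motivation for using it to produce stable prediction targets.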

artificial intelligence, arxiv, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2507.02915

Country:

North America > United States (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > France (0.04)

Genre: Research Report (0.55)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)


BESA: Boosting Encoder Stealing Attack with Perturbation Recovery

Ren, Xuhao, Liang, Haotian, Wang, Yajie, Zhang, Chuan, Xiong, Zehui, Zhu, Liehuang

arXiv.org Artificial Intelligence | Jun-6-2025

To boost the encoder stealing attack under perturbation-based defenses that hinder attack performance, we propose a boosting encoder stealing attack with perturbation recovery, named BESA. It aims to overcome perturbation-based defenses. The core of BESA consists of two modules, perturbation detection and perturbation recovery, which can be combined with canonical encoder stealing attacks. The perturbation detection module uses the feature vectors obtained from the target encoder to infer the defense mechanism employed by the service provider. Once the defense mechanism is detected, the perturbation recovery module leverages a well-designed generative model to restore a clean feature vector from the perturbed one. Through extensive evaluations on various datasets, we demonstrate that BESA enhances the surrogate encoder accuracy of existing encoder stealing attacks by up to 24.63% when facing state-of-the-art defenses and combinations of multiple defenses. Pre-trained encoders are extensively utilized across various domains in real-world scenarios [1]. However, training well-performing pre-trained encoders is a time-consuming, resource-intensive, and costly process [2]. Hence, encoder owners are highly motivated to safeguard the privacy of their pre-trained encoders. Unfortunately, recent works have shown that pre-trained encoders are susceptible to encoder stealing attacks [3]. These attacks allow an attacker to create a surrogate encoder that closely mimics the functionality of a targeted encoder simply by querying it through its APIs. The consequences of such attacks can be quite severe.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2506.04556

Country:

Asia > China > Beijing > Beijing (0.05)
Asia > Singapore (0.05)
North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)


Building Bridges between Regression, Clustering, and Classification

Stewart, Lawrence, Bach, Francis, Berthet, Quentin

arXiv.org Machine Learning | Feb-18-2025

Regression, the task of predicting a continuous scalar target y based on some features x, is one of the most fundamental tasks in machine learning and statistics. It has been observed and theoretically analyzed that the classical approach, mean squared error minimization, can lead to suboptimal results when training neural networks. In this work, we propose a new method to improve the training of these models on regression tasks with continuous scalar targets. Our method is based on casting this task in a different fashion, using a target encoder and a prediction decoder, inspired by approaches in classification and clustering. We showcase the performance of our method on a wide range of real-world datasets.
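
To make the idea of a target encoder and a prediction decoder concrete, here is one simple way such a pair could look. It is an illustrative assumption, not the method of the paper: the scalar target is encoded as weights over a fixed grid of centers (two-hot linear interpolation), and a weight vector is decoded back to a scalar by taking its expectation over the centers. The names `encode_target`, `decode_prediction`, and `centers` are hypothetical.

```python
import numpy as np

def encode_target(y, centers):
    """Encode scalar y as weights over sorted centers (two-hot interpolation)."""
    # Clip so that y always falls inside the range covered by the centers.
    y = float(np.clip(y, centers[0], centers[-1]))
    hi = int(np.searchsorted(centers, y, side="left"))
    weights = np.zeros(len(centers))
    if centers[hi] == y:
        # y sits exactly on a center: one-hot encoding.
        weights[hi] = 1.0
        return weights
    lo = hi - 1
    frac = (y - centers[lo]) / (centers[hi] - centers[lo])
    weights[lo], weights[hi] = 1.0 - frac, frac
    return weights

def decode_prediction(weights, centers):
    """Decode a weight vector back to a scalar as its expectation over centers."""
    return float(np.dot(weights, centers))

centers = np.linspace(0.0, 1.0, 5)   # assumed grid: 0, 0.25, 0.5, 0.75, 1
w = encode_target(0.6, centers)      # mass split between 0.5 and 0.75
decoded = decode_prediction(w, centers)
```

With an encoding of this kind, a network can be trained with a classification-style loss against the encoded weights, and the decoder recovers a scalar prediction; for targets inside the grid, decoding the encoding returns the original value.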

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Machine Learning

2502.02996

Country: Europe > France (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

